AITopics | complexity metric

Collaborating Authors

complexity metric

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

51a472c08e21aef54ed749806e3e6490-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 21:58:41 GMT

information, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.72)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

51a472c08e21aef54ed749806e3e6490-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 16:15:54 GMT

Another possible reason is that it is unclear if the low signal-to-noise ratio of neuroimaging tools such as functional Magnetic Resonance Imaging (fMRI) can allow us to reveal the correlates of complex (and perhaps subtle) syntactic representations.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(2 more...)

Genre: Research Report (0.47)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.90)
Health & Medicine > Diagnostic Medicine > Imaging (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

How Ensemble Learning Balances Accuracy and Overfitting: A Bias-Variance Perspective on Tabular Data

Mohammad, Zubair Ahmed

arXiv.org Artificial IntelligenceDec-8-2025

Abstract--Tree-based ensemble methods consistently outperform single models on tabular classification tasks, yet the conditions under which ensembles provide clear advantages--and prevent overfitting despite using high-variance base learners--are not always well understood by practitioners. We study four real-world classification problems (Breast Cancer diagnosis, Heart Disease prediction, Pima Indians Diabetes, and Credit Card Fraud detection) comparing classical single models against nine ensemble methods using five-seed repeated stratified cross-validation with statistical significance testing. Our results reveal three distinct regimes: (i) On nearly linearly separable data (Breast Cancer), well-regularized linear models achieve 97% accuracy with <2% generalization gaps; ensembles match but do not substantially exceed this performance. We systematically quantify dataset complexity through linearity scores, feature correlation, class separability, and noise estimates, explaining why different data regimes favor different model families. Cross-validated train/test accuracy and generalization-gap plots provide simple visual diagnostics for practitioners to assess when ensemble complexity is warranted. Statistical testing confirms that ensemble gains are significant on nonlinear tasks (p < 0.01) but not on near-linear data (p > 0.15). The study provides actionable guidelines for ensemble model selection in high-stakes tabular applications, with full code and reproducible experiments publicly available. A model that almost perfectly fits its training data can still fail badly on new cases. This gap between training performance and real-world behaviour is the essence of overfitting, and it is particularly problematic in domains such as medical diagnosis and financial fraud detection, where mistakes are costly: missed tumours delay treatment, and undetected fraud translates directly into monetary loss.

accuracy, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2512.05469

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.90)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Exploring Complexity Changes in Diseased ECG Signals for Enhanced Classification

Quintero, Camilo Quiceno, George, Sandip Varkey

arXiv.org Artificial IntelligenceOct-22-2025

The complex dynamics of the heart are reflected in its electrical activity, captured through electrocardiograms (ECGs). In this study we use nonlinear time series analysis to understand how ECG complexity varies with cardiac pathology. Using the large PTB-XL dataset, we extracted nonlinear measures from lead II ECGs, and cross-channel metrics (leads II, V2, A VL) using Spearman correlations and mutual information. Significant differences between diseased and healthy individuals were found in almost all measures between healthy and diseased classes, and between 5 diagnostic superclasses (p < .001). Moreover, incorporating these complexity quantifiers into machine learning models substantially improved classification accuracy measured using area under the ROC curve (AUC) from 0.86 (baseline) to 0.87 (nonlinear measures) and 0.90 (including cross-time series metrics).

artificial intelligence, classification, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.1781

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.36)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Type and Complexity Signals in Multilingual Question Representations

Kokot, Robin, Poelman, Wessel

arXiv.org Artificial IntelligenceOct-9-2025

This work investigates how a multilingual transformer model represents morphosyntactic properties of questions. We introduce the Question Type and Complexity (QTC) dataset with sentences across seven languages, annotated with type information and complexity metrics including dependency length, tree depth, and lexical density. Our evaluation extends probing methods to regression labels with selectivity controls to quantify gains in generalizability. We compare layer-wise probes on frozen Glot500-m (Imani et al., 2023) representations against subword TF-IDF baselines, and a fine-tuned model. Results show that statistical features classify questions effectively in languages with explicit marking, while neural probes capture fine-grained structural complexity patterns better. We use these results to evaluate when contextual representations outperform statistical baselines and whether parameter updates reduce the availability of pre-trained linguistic information.

computational linguistic, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2510.06304

Country:

North America > United States (0.46)
Europe > Belgium > Flanders (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
(2 more...)

Add feedback

A Metrics-Oriented Architectural Model to Characterize Complexity on Machine Learning-Enabled Systems

Ferreira, Renato Cordeiro

arXiv.org Artificial IntelligenceAug-13-2025

--How can the complexity of ML-enabled systems be managed effectively? The goal of this research is to investigate how complexity affects ML-Enabled Systems (MLES). T o address this question, this research aims to introduce a metrics-based architectural model to characterize the complexity of MLES. The goal is to support architectural decisions, providing a guideline for the inception and growth of these systems. This paper showcases the first step for creating the metrics-based architectural model: an extension of a reference architecture that can describe MLES to collect their metrics.

artificial intelligence, complexity, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/CAIN66642.2025.00041

2506.08153

Country: South America > Brazil (0.14)

Genre:

Workflow (0.92)
Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Architecture (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.32)

Add feedback

Mass-Scale Analysis of In-the-Wild Conversations Reveals Complexity Bounds on LLM Jailbreaking

Creo, Aldan, Fernandez, Raul Castro, Cebrian, Manuel

arXiv.org Artificial IntelligenceJul-14-2025

As large language models (LLMs) become increasingly deployed, understanding the complexity and evolution of jailbreaking strategies is critical for AI safety. We present a mass-scale empirical analysis of jailbreak complexity across over 2 million real-world conversations from diverse platforms, including dedicated jailbreaking communities and general-purpose chatbots. Using a range of complexity metrics spanning probabilistic measures, lexical diversity, compression ratios, and cognitive load indicators, we find that jailbreak attempts do not exhibit significantly higher complexity than normal conversations. This pattern holds consistently across specialized jailbreaking communities and general user populations, suggesting practical bounds on attack sophistication. Temporal analysis reveals that while user attack toxicity and complexity remains stable over time, assistant response toxicity has decreased, indicating improving safety mechanisms. The absence of power-law scaling in complexity distributions further points to natural limits on jailbreak development. Our findings challenge the prevailing narrative of an escalating arms race between attackers and defenders, instead suggesting that LLM safety evolution is bounded by human ingenuity constraints while defensive measures continue advancing. Our results highlight critical information hazards in academic jailbreak disclosure, as sophisticated attacks exceeding current complexity baselines could disrupt the observed equilibrium and enable widespread harm before defensive adaptation.

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2507.08014

Country: Europe > Spain (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.95)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Enhancing LLM-Based Code Generation with Complexity Metrics: A Feedback-Driven Approach

Sepidband, Melika, Taherkhani, Hamed, Wang, Song, Hemmati, Hadi

arXiv.org Artificial IntelligenceJun-2-2025

--Automatic code generation has gained significant momentum with the advent of Large Language Models (LLMs) such as GPT -4. Although many studies focus on improving the effectiveness of LLMs for code generation, very limited work tries to understand the generated code's characteristics and leverage that to improve failed cases. In this paper, as the most straightforward characteristic of code, we investigate the relationship between code complexity and the success of LLMgenerated code. Using a large set of standard complexity metrics, we first conduct an empirical analysis to explore their correlation with LLM's performance on code generation (i.e., Pass@1). Using logistic regression models, we identify which complexity metrics are most predictive of code correctness. Building on these findings, we propose an iterative feedback method, where LLMs are prompted to generate correct code based on complexity metrics from previous failed outputs. Experiment results show that our approach makes notable improvements, particularly with a smaller LLM (GPT - 3.5 T urbo), where, e.g., Pass@1 increased by 35.71% compared to the baseline's improvement of 12.5% on the HumanEval dataset. The study expands experiments to BigCodeBench and integrates the method with the Reflexion code generation agent, leading to Pass@1 improvements of 20% (GPT -4o) and 23.07% The results highlight that complexity-aware feedback enhances both direct LLM prompting and agent-based workflows. Automatic code generation aims to reduce manual coding and boost productivity [1], with LLMs like GPT -4 [2] making significant advancements. However, ensuring accuracy and correctness remains a challenge. Recently, several approaches have been proposed to enhance LLM-based code generation.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.23953

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

C2RUST-BENCH: A Minimized, Representative Dataset for C-to-Rust Transpilation Evaluation

Sirlanci, Melih, Yagemann, Carter, Lin, Zhiqiang

arXiv.org Artificial IntelligenceApr-22-2025

Despite the effort in vulnerability detection over the last two decades, memory safety vulnerabilities continue to be a critical problem. Recent reports suggest that the key solution is to migrate to memory-safe languages. To this end, C-to-Rust transpilation becomes popular to resolve memory-safety issues in C programs. Recent works propose C-to-Rust transpilation frameworks; however, a comprehensive evaluation dataset is missing. Although one solution is to put together a large enough dataset, this increases the analysis time in automated frameworks as well as in manual efforts for some cases. In this work, we build a method to select functions from a large set to construct a minimized yet representative dataset to evaluate the C-to-Rust transpilation. We propose C2RUST-BENCH that contains 2,905 functions, which are representative of C-to-Rust transpilation, selected from 15,503 functions of real-world programs.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2504.15144

Country: North America > United States (0.68)

Genre: Research Report (0.84)

Industry:

Information Technology (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)

Add feedback

Uncovering Fairness through Data Complexity as an Early Indicator

Ferreira, Juliett Suárez, Slavkovik, Marija, Casillas, Jorge

arXiv.org Artificial IntelligenceApr-9-2025

Fairness constitutes a concern within machine learning (ML) applications. Currently, there is no study on how disparities in classification complexity between privileged and unprivileged groups could influence the fairness of solutions, which serves as a preliminary indicator of potential unfairness. In this work, we investigate this gap, specifically, we focus on synthetic datasets designed to capture a variety of biases ranging from historical bias to measurement and representational bias to evaluate how various complexity metrics differences correlate with group fairness metrics. We then apply association rule mining to identify patterns that link disproportionate complexity differences between groups with fairness-related outcomes, offering data-centric indicators to guide bias mitigation. Our findings are also validated by their application in real-world problems, providing evidence that quantifying group-wise classification complexity can uncover early indicators of potential fairness challenges. This investigation helps practitioners to proactively address bias in classification tasks.

artificial intelligence, dataset, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2504.05923

Country:

North America > United States (0.46)
Europe (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback